25 research outputs found
Numerical methods for least squares problems with application to data assimilation
The Levenberg-Marquardt algorithm (LM) is one of the most popular algorithms for the solution of nonlinear least squares problems. Motivated by the problem structure in data assimilation, we consider in this thesis the extension of the LM algorithm to the scenarios where the linearized least squares subproblems, of the form min||Ax - b ||^2, are solved inexactly and/or the gradient model is noisy and accurate only within a certain probability. Under appropriate assumptions, we show that the modified algorithm converges globally and almost surely to a first order stationary point. Our approach is applied to an instance in variational data assimilation where stochastic models of the gradient are computed by the so-called ensemble Kalman smoother (EnKS). A convergence proof in L^p of EnKS in the limit for large ensembles to the Kalman smoother is given. We also show the convergence of LM-EnKS approach, which is a variant of the LM algorithm with EnKS as a linear solver, to the classical LM algorithm where the linearized subproblem is solved exactly. The sensitivity of the trucated sigular value decomposition method to solve the linearized subprobems is studied. We formulate an explicit expression for the condition number of the truncated least squares solution. This expression is given in terms of the singular values of A and the Fourier coefficients of b
Méthodes numériques pour les problèmes des moindres carrés, avec application à l'assimilation de données
L'algorithme de Levenberg-Marquardt (LM) est parmi les algorithmes les plus populaires pour la résolution des problèmes des moindres carrés non linéaire. Motivés par la structure des problèmes de l'assimilation de données, nous considérons dans cette thèse l'extension de l'algorithme LM aux situations dans lesquelles le sous problème linéarisé, qui a la forme min||Ax - b ||^2, est résolu de façon approximative, et/ou les données sont bruitées et ne sont précises qu'avec une certaine probabilité. Sous des hypothèses appropriées, on montre que le nouvel algorithme converge presque sûrement vers un point stationnaire du premier ordre. Notre approche est appliquée à une instance dans l'assimilation de données variationnelles où les modèles stochastiques du gradient sont calculés par le lisseur de Kalman d'ensemble (EnKS). On montre la convergence dans L^p de l'EnKS vers le lisseur de Kalman, quand la taille de l'ensemble tend vers l'infini. On montre aussi la convergence de l'approche LM-EnKS, qui est une variante de l'algorithme de LM avec l'EnKS utilisé comme solveur linéaire, vers l'algorithme classique de LM ou le sous problème est résolu de façon exacte. La sensibilité de la méthode de décomposition en valeurs singulières tronquée est étudiée. Nous formulons une expression explicite pour le conditionnement de la solution des moindres carrés tronqués. Cette expression est donnée en termes de valeurs singulières de A et les coefficients de Fourier de b. ABSTRACT : The Levenberg-Marquardt algorithm (LM) is one of the most popular algorithms for the solution of nonlinear least squares problems. Motivated by the problem structure in data assimilation, we consider in this thesis the extension of the LM algorithm to the scenarios where the linearized least squares subproblems, of the form min||Ax - b ||^2, are solved inexactly and/or the gradient model is noisy and accurate only within a certain probability. Under appropriate assumptions, we show that the modified algorithm converges globally and almost surely to a first order stationary point. Our approach is applied to an instance in variational data assimilation where stochastic models of the gradient are computed by the so-called ensemble Kalman smoother (EnKS). A convergence proof in L^p of EnKS in the limit for large ensembles to the Kalman smoother is given. We also show the convergence of LM-EnKS approach, which is a variant of the LM algorithm with EnKS as a linear solver, to the classical LM algorithm where the linearized subproblem is solved exactly. The sensitivity of the trucated sigular value decomposition method to solve the linearized subprobems is studied. We formulate an explicit expression for the condition number of the truncated least squares solution. This expression is given in terms of the singular values of A and the Fourier coefficients of b
Convergence and Complexity Analysis of a Levenberg–Marquardt Algorithm for Inverse Problems
The Levenberg–Marquardt algorithm is one of the most popular algorithms for finding the solution of nonlinear least squares problems. Across different modified variations of the basic procedure, the algorithm enjoys global convergence, a competitive worst-case iteration complexity rate, and a guaranteed rate of local convergence for both zero and nonzero small residual problems, under suitable assumptions. We introduce a novel Levenberg-Marquardt method that matches, simultaneously, the state of the art in all of these convergence properties with a single seamless algorithm. Numerical experiments confirm the theoretical behavior of our proposed algorithm
Ensemble DNN for Age-of-Information Minimization in UAV-assisted Networks
This paper addresses the problem of Age-of-Information (AoI) in UAV-assisted
networks. Our objective is to minimize the expected AoI across devices by
optimizing UAVs' stopping locations and device selection probabilities. To
tackle this problem, we first derive a closed-form expression of the expected
AoI that involves the probabilities of selection of devices. Then, we formulate
the problem as a non-convex minimization subject to quality of service
constraints. Since the problem is challenging to solve, we propose an Ensemble
Deep Neural Network (EDNN) based approach which takes advantage of the dual
formulation of the studied problem. Specifically, the Deep Neural Networks
(DNNs) in the ensemble are trained in an unsupervised manner using the
Lagrangian function of the studied problem. Our experiments show that the
proposed EDNN method outperforms traditional DNNs in reducing the expected AoI,
achieving a remarkable reduction of .Comment: 6 pages, 3 figure
A Subsampling Line-Search Method with Second-Order Results
In many contemporary optimization problems such as those arising in machine
learning, it can be computationally challenging or even infeasible to evaluate
an entire function or its derivatives. This motivates the use of stochastic
algorithms that sample problem data, which can jeopardize the guarantees
obtained through classical globalization techniques in optimization such as a
trust region or a line search. Using subsampled function values is particularly
challenging for the latter strategy, which relies upon multiple evaluations. On
top of that all, there has been an increasing interest for nonconvex
formulations of data-related problems, such as training deep learning models.
For such instances, one aims at developing methods that converge to
second-order stationary points quickly, i.e., escape saddle points efficiently.
This is particularly delicate to ensure when one only accesses subsampled
approximations of the objective and its derivatives.
In this paper, we describe a stochastic algorithm based on negative curvature
and Newton-type directions that are computed for a subsampling model of the
objective. A line-search technique is used to enforce suitable decrease for
this model, and for a sufficiently large sample, a similar amount of reduction
holds for the true objective. By using probabilistic reasoning, we can then
obtain worst-case complexity guarantees for our framework, leading us to
discuss appropriate notions of stationarity in a subsampling context. Our
analysis encompasses the deterministic regime, and allows us to identify
sampling requirements for second-order line-search paradigms. As we illustrate
through real data experiments, these worst-case estimates need not be satisfied
for our method to be competitive with first-order strategies in practice
Minibatch Stochastic Three Points Method for Unconstrained Smooth Minimization
In this paper, we propose a new zero order optimization method called
minibatch stochastic three points (MiSTP) method to solve an unconstrained
minimization problem in a setting where only an approximation of the objective
function evaluation is possible. It is based on the recently proposed
stochastic three points (STP) method (Bergou et al., 2020). At each iteration,
MiSTP generates a random search direction in a similar manner to STP, but
chooses the next iterate based solely on the approximation of the objective
function rather than its exact evaluations. We also analyze our method's
complexity in the nonconvex and convex cases and evaluate its performance on
multiple machine learning tasks